El presente trabajo se propone estudiar un conjunto de datos con los incidentes atendidos por el departamento de bomberos de una gran urbe contemporánea para ver hasta qué punto estos se adecuan a la idea inicial que podríamos de tener. A saber, que en su mayoría son eventos excepcionales y por tanto es tentador pensar que son bien modelados por la distribución del títutlo.
Siguiente el consejo de la docente, añadimos también el estudio a través de la distribución binomial negativa, que en su concepción heurística también se plantea como ideal para modelar esta clase de sucesos.
Cargamos algunas librerías útiles:
library(MASS, pos = "package:base") # WARNING: Import before tidyverse to keep dplyr's `select`
library(tidyverse)
library(lubridate, pos = "package:base")
library(glue)
library(here)
library(knitr)
Leemos el dataset, bajado de acá:
read_dataset <- function(fn) {
read_csv(fn) %>%
rename(
incident_type = CFD_INCIDENT_TYPE_GROUP,
incident_time = CREATE_TIME_INCIDENT) %>%
# Reemplazamos NA con "NULL" para poder seleccionar los incident_type con
# valor faltante más tarde:
mutate(incident_type = map_chr(incident_type,
~ifelse(is.na(.x), "NULL", .x)))
}
count_incidents_per_type <- function(df) {
df %>% count(incident_type) %>% arrange(-n)
}
fn <- here("data/Cincinnati_Fire_Incidents__CAD___including_EMS__ALS_BLS_.csv")
full_incidents <- read_dataset(fn)
## Parsed with column specification:
## cols(
## ADDRESS_X = col_character(),
## LATITUDE_X = col_double(),
## LONGITUDE_X = col_double(),
## AGENCY = col_character(),
## CREATE_TIME_INCIDENT = col_character(),
## DISPOSITION_TEXT = col_character(),
## EVENT_NUMBER = col_character(),
## INCIDENT_TYPE_ID = col_character(),
## INCIDENT_TYPE_DESC = col_character(),
## NEIGHBORHOOD = col_character(),
## ARRIVAL_TIME_PRIMARY_UNIT = col_character(),
## BEAT = col_character(),
## CLOSED_TIME_INCIDENT = col_character(),
## DISPATCH_TIME_PRIMARY_UNIT = col_character(),
## CFD_INCIDENT_TYPE = col_character(),
## CFD_INCIDENT_TYPE_GROUP = col_character(),
## COMMUNITY_COUNCIL_NEIGHBORHOOD = col_character()
## )
n_per_incident_type <- count_incidents_per_type(full_incidents)
# Uncomment to print the list of incident types and their frequency:
# kable(n_per_incident_type)
Lo que nos interesa es separar las llamadas según su tipo y ver la frecuencia de cada tipo, para eso por lo pronto las contamos, considerando que el período de los registros abarca unos tres años.
| incident_type | n |
|---|---|
| SICK PERSON | 30798 |
| BREATHING PROBLEMS | 24009 |
| FALLS | 22481 |
| PERSON DOWN | 19529 |
| AUTOMATIC FIRE ALARM | 19507 |
| MEDICAL EMERGENCY | 14627 |
| CHEST PAIN / CHEST DISCOMFORT (NON-TRAUMATIC) | 14327 |
| INFORMATION TELETYPE-NO DISPATCH | 12682 |
| UNCONSCIOUS / FAINTING (NEAR) | 10468 |
| CONVULSIONS / SEIZURES | 9019 |
| UNKNOWN PROBLEM (PERSON DOWN) | 8358 |
| ABDOMINAL PAIN / PROBLEMS | 7462 |
| ACCIDENT WITH INJURY - FIRE ONLY | 7324 |
| HEROIN OVERDOSE | 7290 |
| HEMORRHAGE / LACERATIONS | 7150 |
| ASSAULT WITH INJURY | 6224 |
| DIABETIC PROBLEMS | 5205 |
| ACCIDENT WITH INJURY | 4329 |
| STRUCTURE FIRE | 4317 |
| NULL | 4054 |
| OUTDOOR FIRE | 3792 |
| DETAIL/SPECIAL ASSIGNMENT | 3528 |
| STROKE (CVA) / TRANSIENT ISCHEMIC ATTACK (TIA) | 3313 |
| PREGNANCY / CHILDBIRTH / MISCARRIAGE | 3200 |
| TRAUMATIC INJURIES (SPECIFIC) | 3095 |
| HEART PROBLEM / A.I.C.D | 2709 |
| OVERDOSE / POSIONING (INGESTION) | 2620 |
| SUICIDE ATTEMPT | 2611 |
| FUMES | 2526 |
| SALVAGE | 2514 |
keep_incidents_of_type <- function(chosen_type) {
full_incidents %>%
select(incident_type, incident_time) %>%
filter(incident_type == chosen_type) %>%
select(incident_time) %>%
mutate(incident_time = as.POSIXct(incident_time,
format = "%m/%d/%Y %I:%M:%S %p",
tz = "UTC")) %>%
mutate(incident_day = floor_date(incident_time, "day")) -> individual_incidents_table
return(
list(table=individual_incidents_table, chosen_type=chosen_type)
)
}
# Esta función toma la tabla donde hay una entrada y fecha (redondeada) por evento y agrupa los eventos que ocurrieron en el mismo día. Además, añade los días de frecuencia 0 hallados a través de una simple diferencia entre las fechas extremas y la cantidad de días en la lista ingresada.
get_freq_by_date_table <- function(kept_incidents) {
daily_incidents_count <-
kept_incidents$table %>%
count(incident_day) %>%
select(n)
# Agregamos los días sin llamadas (días que no aparecen en el dataset)
# como filas de ceros:
max_day <- max(kept_incidents$table$incident_day)
min_day <- min(kept_incidents$table$incident_day)
total_days <- max(as.integer(max_day - min_day), nrow(daily_incidents_count))
missing_days <- total_days - nrow(daily_incidents_count)
daily_incidents_count <-
rbind(tibble(n = rep(0, missing_days)), daily_incidents_count) %>%
mutate(n = as.integer(n))
# Frecuencia de frecuencias: contamos cuántos días tienen 0 llamadas,
# cuántos días tienen 1 llamada, ...
daily_incidents_freq <-
daily_incidents_count %>%
count(n) %>%
rename(count = nn) %>%
mutate(
density = count / nrow(daily_incidents_count),
density_type = "empirical") %>%
select(n, density_type, density)
# Safety check:
assertthat::assert_that((near(sum(daily_incidents_freq$density), 1)))
return( list(
freq=daily_incidents_freq,
count=daily_incidents_count
)
)
}
build_chosen_incident_dataframe <- function(chosen_type) {
frame_by_date <- keep_incidents_of_type(chosen_type)
frame_by_freq <- get_freq_by_date_table(frame_by_date)
# Asumiendo distribución Poisson, estimamos el parámetro como la media muestral:
frame_by_freq["lambda"] <- mean(frame_by_freq$count$n)
return( frame_by_freq )
}
compare_poisson_and_bn_for_incident_type <- function(chosen_incident_type, graphic = T) {
chosen_incident_dataframe <- build_chosen_incident_dataframe(chosen_incident_type)
daily_incidents_count <- chosen_incident_dataframe$count
daily_incidents_freq <- chosen_incident_dataframe$freq
lambda_estimate <- chosen_incident_dataframe$lambda
# Ajustamos el número de incidentes diario con la distribución Binomial Negativa
# para comparar con la Poisson:
fit <- fitdistr(daily_incidents_count$n, densfun = "negative binomial")
bn_size_estimate <- fit$estimate[["size"]]
# Obs: El parámetro "mu" de la parametrización alternativa de la distribución
# BN se estima igual que el lambda de Poisson, con la media muestral.
bn_mu_estimate <- lambda_estimate
max_n <- max(daily_incidents_count$n)
min_n <- min(daily_incidents_count$n)
# Juntamos las densidades esperadas por Poisson y BN para cada n observado
# para luego comparar con la densidad empírica:
compute_density <- function(n, density_type, density_function) {
if (density_type == "poisson") {
density_function(n, lambda = lambda_estimate)
} else if (density_type == "negative_binomial") {
density_function(n, size = bn_size_estimate, mu = bn_mu_estimate)
} else {
stop(glue("I don't know how to compute density for type: {density_type}"))
}
}
probabilities_per_n <-
cross_df(list(
n = seq(min_n, max_n),
density_type = c("poisson", "negative_binomial"))) %>%
mutate(
density_function = map(density_type,
~ifelse(.x == "poisson", dpois, dnbinom)),
density = pmap_dbl(list(n, density_type, density_function),
compute_density)) %>%
select(n, density_type, density)
probabilities_per_n <- rbind(probabilities_per_n, daily_incidents_freq)
# Sumamos las curvas de densidad Poisson y BN al gráfico de la empírica:
poisson_color <- "ForestGreen"
bn_color <- "IndianRed"
empirical_color <- "#444444"
if (graphic == T ) {
fig <-
probabilities_per_n %>%
ggplot() +
aes(x = n, y = density, color = density_type) +
geom_point(alpha = .7) +
geom_line() +
scale_color_manual(values = c(empirical_color, bn_color, poisson_color)) +
labs(
title = glue("Número de '{chosen_incident_type}' por día"),
subtitle = "Ajuste de la distribución empírica con Poisson y Binomial Negativa",
y = "Densidad",
x = glue("Número de llamadas"))
return (list(
probabilities_per_n = probabilities_per_n,
fig = fig,
lambda_estimate = lambda_estimate
))
}
return (list(
probabilities_per_n = probabilities_per_n,
lambda_estimate = lambda_estimate
))
}
### Para generar los gráficos, cambiar la variable fija "generate_graphics"
distribuciones_de_incidentes_largas <- list()
for (incident_type in n_per_incident_type$incident_type) {
generate_graphics <- T
print(glue("Working with: {incident_type}"))
comparison <- compare_poisson_and_bn_for_incident_type(incident_type, generate_graphics)
# Sanitize the incident type for the plot filename:
name <- str_to_lower(incident_type)
name <- str_replace_all(name, "/", "-")
distribuciones_de_incidentes_largas[[name]] <- list(nombre = name, tabla = comparison$probabilities_per_n, lambda = comparison$lambda_estimate)
if (generate_graphics) {
filename <- here(glue("results/{name}.png"))
ggsave(filename, comparison$fig, width = 10, height = 6)
print(comparison$fig)
}
}
## Working with: SICK PERSON
## Working with: BREATHING PROBLEMS
## Working with: FALLS
## Working with: PERSON DOWN
## Working with: AUTOMATIC FIRE ALARM
## Working with: MEDICAL EMERGENCY
## Working with: CHEST PAIN / CHEST DISCOMFORT (NON-TRAUMATIC)
## Working with: INFORMATION TELETYPE-NO DISPATCH
## Working with: UNCONSCIOUS / FAINTING (NEAR)
## Working with: CONVULSIONS / SEIZURES
## Working with: UNKNOWN PROBLEM (PERSON DOWN)
## Working with: ABDOMINAL PAIN / PROBLEMS
## Working with: ACCIDENT WITH INJURY - FIRE ONLY
## Working with: HEROIN OVERDOSE
## Working with: HEMORRHAGE / LACERATIONS
## Working with: ASSAULT WITH INJURY
## Working with: DIABETIC PROBLEMS
## Working with: ACCIDENT WITH INJURY
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: STRUCTURE FIRE
## Working with: NULL
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: OUTDOOR FIRE
## Working with: DETAIL/SPECIAL ASSIGNMENT
## Working with: STROKE (CVA) / TRANSIENT ISCHEMIC ATTACK (TIA)
## Working with: PREGNANCY / CHILDBIRTH / MISCARRIAGE
## Working with: TRAUMATIC INJURIES (SPECIFIC)
## Working with: HEART PROBLEM / A.I.C.D
## Working with: OVERDOSE / POSIONING (INGESTION)
## Working with: SUICIDE ATTEMPT
## Working with: FUMES
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: SALVAGE
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: FIRE ADVISED - NO DISPATCH
## Working with: WIRES DOWN/ARCING
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: BACK PAIN (NON-TRAUMATIC OR NON-RECENT TRAUMA)
## Working with: FIRE DRILL - NO RESPONSE
## Working with: SHOOTING HAS OCCURRED
## Working with: VEHICLE FIRE
## Working with: LOCK IN/LOCK OUT
## Working with: CARDIAC OR RESPIRATORY ARREST / DEATH
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: HEADACHE
## Working with: WALKIN AT FIRE STATIOIN
## Working with: ALLERGIES (REASTIONS) / ENVENOMATIONS (STINGS, BITES)
## Working with: CARBON MONOXIDE ALARM
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: ENTRAPMENT - AUTO ACCIDENT
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: STUCK IN ELEVATOR
## Working with: STILL ALARM
## Working with: POSSIBLE DOA
## Working with: CHEST PAIN/CHEST DISCOMFORT (NON-TRAUMATIC)
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: INVESTIGATION
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: DOMESTIC VIOLENCE WITH INJURY
## Working with: FIRE HYDRANT LEAKING OR STRUCK
## Working with: TRAFFIC / TRANSPORTATION INCIDENTS
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: NOT USED
## Working with: AUTO ACCIDENT - CAR HIT BUILDING
## Working with: CUTTING - FIRE ONLY
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: ROBBERY WITH INJURY
## Working with: CHOKING
## Working with: TASING - POLICE REQUEST
## Working with: CUTTING
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: ASSAULT / SEXUAL ASSAULT / STUN GUN
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: GAS SPILL - MINOR < 25 GALLONS
## Working with: ANIMAL BITES / ATTACKS
## Working with: EYE PROBLEMS / INJURIES
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: FIRE REPORTED OUT
## Working with: SWAT OPERATION
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: BURNS (SCALDS) / EXPLOSION (BLAST)
## Working with: OUTLET SPARKING
## Working with: FIRE EQUIPMENT INVOLVED IN ACCIDENT
## Working with: POLICE OFFICER NEEDS ASSISTANCE
## Working with: CARBON MONOXIDE ILL/WITH SYMPTOMS
## Working with: HEAT / COLD EXPOSURE
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: BOMB REMOVAL
## Working with: RIVER EMERGENCY
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: TRUCK FIRE
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: ENTRAPMENT - MECHANICAL/FIRE ONLY
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: MUTUAL AID REQEST
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: PSYCHIATRIC / ABNORMAL BEHAVIOR / SUICIDE ATTEMPT
## Working with: STAB / GUNSHOT / PENETRATING TRAUMA
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: TEST
## Working with: INACTIVITY ALARM
## Working with: RAPE WITH INJURY
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: INJURY - POLICE REQUEST
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: ELECTROCUTION / LIGHTNING
## Working with: FIREFIGHTER NEEDS ASSISTANCE
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: RAPE
## Working with: CHECMICAL INVESTIGATION
## Working with: STRUCTURAL OR TRENCH COLLAPSE
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: AIR CRAFT IN TROUBLE OR DOWN
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: GAS SPILL - MAJOR > 25 GALLONS
## Working with: BIOLOGICAL/CHEMCIAL THREAT
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: DROWNING - FIRE ONLY
## Working with: DROWNING - LARGE BODY OF WATER
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: INACCESSIBLE INCIDENT / OTHER ENTRAPMENTS (NON-TRAFFIC)
## Working with: DROWNING / NEAR DROWNING / DIVING / SCUBA ACCIDENT
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: CHEMICAL SPILL
## Working with: HIGH RISK SEARCH WARRANT
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: CARBON MONOXIDE /INHALATION / HAZMAT / CBRN
## Working with: TRANSFER CALL
## Working with: WATER RESCUE TRIBUTARY
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: GREATER CINCINNATI AIRPORT CRASH
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Working with: CARDIAC OR RESPIRATORY ARREST/DEATH
## Working with: BOAT FIRE ON RIVER
## Working with: FIRE SERVICE - NO DISPATCH/ NOT USED
## Working with: BIOHAZARD ALARM AT POST OFFICE
## Working with: HEMORRHAGE/LACERATIONS
saveRDS(distribuciones_de_incidentes_largas,"incidentes.rds")
En realidad para ser totalmente confiable este estudio necesita algún tipo de bootsrap como k-pliegues para evitar un sesgo hacia la confirmación pero consideramos que para los fines de este vistazo no es necesario.
distribuciones_de_incidentes_largas %>%
map(~.$tabla) %>%
map(~spread(.,density_type,density)) %>%
map(
~mutate( . , error_p = (poisson - empirical)^2,
error_bn = (negative_binomial - empirical)^2,
)
) -> distribuciones_de_incidentes
obtener_varianzas <- function(tabla) {
v_poisson <- sum(tabla$error_p)/nrow(tabla)
v_binomial_neg <- sum(tabla$error_bn)/nrow(tabla)
ganadora <- ifelse(v_poisson <= v_binomial_neg,"poisson","binomial negativa : (")
return(list(ganadora=ganadora,varianza_poisson=v_poisson, varianza_binomial_negativa=v_binomial_neg))
}
distribuciones_de_incidentes %>% map(obtener_varianzas) -> varianza_por_caso
son_poisson <- which(map( varianza_por_caso, ~.$ganadora == "poisson" ) == 1 )
son_pascal <- which(map( varianza_por_caso, ~.$ganadora == "binomial negativa : (" ) == 1 )
Los eventos que tienen un comportamiento mejor descrito por la distribución de Poisson son los siguientes:
| x | |
|---|---|
| pregnancy - childbirth - miscarriage | 24 |
| back pain (non-traumatic or non-recent trauma) | 33 |
| allergies (reastions) - envenomations (stings, bites) | 41 |
| choking | 56 |
| fire reported out | 63 |
| burns (scalds) - explosion (blast) | 65 |
| carbon monoxide ill-with symptoms | 69 |
| psychiatric - abnormal behavior - suicide attempt | 76 |
| electrocution - lightning | 82 |
| rape | 84 |
| checmical investigation | 85 |
| chemical spill | 94 |
| carbon monoxide -inhalation - hazmat - cbrn | 96 |
| cardiac or respiratory arrest-death | 100 |
| boat fire on river | 101 |
| fire service - no dispatch- not used | 102 |
| biohazard alarm at post office | 103 |
| hemorrhage-lacerations | 104 |
Muchos más eventos tuvieron distribución binomial negativa (o fueron mejor explicados por ella)
| x | |
|---|---|
| falls | 3 |
| chest pain - chest discomfort (non-traumatic) | 7 |
| unconscious - fainting (near) | 9 |
| unknown problem (person down) | 11 |
| hemorrhage - lacerations | 15 |
| diabetic problems | 17 |
| stroke (cva) - transient ischemic attack (tia) | 23 |
| traumatic injuries (specific) | 25 |
| heart problem - a.i.c.d | 26 |
| overdose - posioning (ingestion) | 27 |
| suicide attempt | 28 |
| shooting has occurred | 35 |
| cardiac or respiratory arrest - death | 38 |
| walkin at fire statioin | 40 |
| carbon monoxide alarm | 42 |
| stuck in elevator | 44 |
| possible doa | 46 |
| chest pain-chest discomfort (non-traumatic) | 47 |
| investigation | 48 |
| domestic violence with injury | 49 |
| traffic - transportation incidents | 51 |
| not used | 52 |
| auto accident - car hit building | 53 |
| cutting - fire only | 54 |
| tasing - police request | 57 |
| gas spill - minor < 25 gallons | 60 |
| animal bites - attacks | 61 |
| eye problems - injuries | 62 |
| outlet sparking | 66 |
| fire equipment involved in accident | 67 |
| police officer needs assistance | 68 |
| heat - cold exposure | 70 |
| bomb removal | 71 |
| river emergency | 72 |
| truck fire | 73 |
| mutual aid reqest | 75 |
| stab - gunshot - penetrating trauma | 77 |
| inactivity alarm | 79 |
| rape with injury | 80 |
| injury - police request | 81 |
| firefighter needs assistance | 83 |
| structural or trench collapse | 86 |
| air craft in trouble or down | 87 |
| gas spill - major > 25 gallons | 88 |
| biological-chemcial threat | 89 |
| drowning - fire only | 90 |
| drowning - large body of water | 91 |
| inaccessible incident - other entrapments (non-traffic) | 92 |
| drowning - near drowning - diving - scuba accident | 93 |
| high risk search warrant | 95 |
| transfer call | 97 |
| water rescue tributary | 98 |
distribuciones_de_incidentes_largas %>% map_dbl(~.$lambda) -> vector_lambdas
lambdas <- tibble(nombre = names(vector_lambdas),lambda=unname(vector_lambdas))
lambdas %>% ggplot() +
aes(x=lambda) +
geom_histogram(binwidth=1)
lambdas %>% mutate( distribucion = rep("empírica") ) %>% select (lambda,distribucion) -> lambdas_larga
min_lambda <- min(lambdas_larga$lambda)
max_lambda <- max(lambdas_larga$lambda)
nro_de_intervalos <- 23
lambdas_larga %>% mutate(Intervalo = cut(lambda,breaks= nro_de_intervalos,include.lowest=F, ordered = T ) ) %>%
group_by(Intervalo) %>%
summarise(Frecuencia_emp = n()/nrow(lambdas_larga) ) -> lambdas_freq
#Necesitamos obtener los valores centrales de cada intervalo. Esto es sorprendentemente trabajoso.
cbind(inferior = as.numeric( sub("\\((.+),.*", "\\1", lambdas_freq$Intervalo) ),
superior = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", lambdas_freq$Intervalo) )) %>%
as.tibble -> extremos
lambdas_agrupada <- bind_cols(lambdas_freq,extremos)
lambdas_agrupada %>% mutate(lambda_centro=(superior-inferior)/2+inferior) -> lambdas_agrupada
distribucion_lambda <- fitdistr(lambdas$lambda,"exponential")
distribucion_gamma <- fitdistr(lambdas$lambda,"gamma")
forma_g <- distribucion_gamma$estimate[1]
tasa_g <- distribucion_gamma$estimate[2]
lambdas_agrupada %>% mutate(
Frecuencia_estimada_exp = dexp(lambda_centro,rate=distribucion_lambda$estimate),
Frecuencia_estimada_gamma = dgamma(x=lambda_centro,shape=forma_g,rate=tasa_g)) -> lambdas_agrupada
lambdas_larga <- gather(lambdas_agrupada,Frecuencia_emp,Frecuencia_estimada_exp,Frecuencia_estimada_gamma,key="Origen",value="Frec")
ggplot(data = lambdas_larga, mapping = aes(x=lambda_centro,y=Frec, color = Origen )) +
geom_line()
Veamos la consistencia del estimador. Construimos una sucesion de estimadores de lambda con los n primeros valores de la muestra para n desde 20 hasta el tamaño total de la muestra. Comparamos esos valores a la estimacion de lambda sobre toda la muestra, tomada como el valor real del párametro.
Elegimos el más frecuente de los eventos con comportamiento Poisson: hechos relacionados a partos.
evento <- n_per_incident_type[24,]$incident_type
daily_incidents <-
keep_incidents_of_type(evento)$table %>%
count(incident_day) %>%
select(n)
set.seed(42)
muestra<-sample(daily_incidents$n,length(daily_incidents$n),replace = FALSE)
delta_values<-c()
for (i in 20:length(daily_incidents$n)){
delta_values<-c(delta_values,mean(mean(muestra[1:i])))
}
lambda_est <- fitdistr(daily_incidents$n,"poisson")
lambda_est <- unname(lambda_est[[1]])
ggplot()+
geom_point(aes(x=20:length(daily_incidents$n),y=delta_values,colour="Sucesion de estimadores"))+
geom_hline(yintercept=lambda_est,colour="blue")+
labs(title="Consistencia del estimador", x="n", y="Estimaciones")+
theme(legend.position = "none")
Estudiemos la distribucion del estimador. Generamos por bootstrap una muestra de N estimaciones del parametro de Poisson
dist_estimador_boot<-function(N){
set.seed(42)
estim_values<-c()
for (i in 1:N){
muestra<-sample(daily_incidents$n,length(daily_incidents$n),replace=TRUE)
estim_values<-c(estim_values,mean(muestra))
}
return(estim_values)
}
muestra_estim<-dist_estimador_boot(10000)
fig<-ggplot()+
geom_density(aes(x=muestra_estim),color="RoyalBlue")+
labs(title = paste("Funcion de probabilidad de la muestra de estimador de Poisson con N=",10000),x="")
print(fig)
La esperanza y la varianza del estimador estimadas sobre la muestra de valores obtenidas por Bootstrap son:
paste("Esperanza=",mean(muestra_estim))
## [1] "Esperanza= 2.55814052757794"
paste("Estimacion de lambda:",lambda_est)
## [1] "Estimacion de lambda: 2.55795363709033"
paste("Varianza=",var(muestra_estim))
## [1] "Varianza= 0.00154707891916079"
paste("Cota de Rao-Cramer calculada con la estimacion de lambda=",lambda_est/length(daily_incidents$n))
## [1] "Cota de Rao-Cramer calculada con la estimacion de lambda= 0.00204472712796989"
Si la estimacion de lambda es correcta, se ve que el estimador es insesgado y alcanza la cota de Rao Cramer. Se ve IMVU.
Veamos si el estimador es asintóticamente normal y eficiente. Comparamos la distribución empírica del estimador a una normal cuya esperanza es la estimacion de lambda y cuya varianza es la cota de Rao Cramer calculada con la estimación de lambda.
dist_asint<-function(N){
estim_values<-dist_estimador_boot(N)
fig<-ggplot()+
geom_density(aes(x=estim_values,color="Empírica"))+
stat_function(aes(x=c(min(estim_values),max(estim_values)),color="Normal"),fun=dnorm,args=list(mean=lambda_est,sd=sqrt(lambda_est/length(daily_incidents$n))))+
labs(title = paste("Densidad empírica del estimador con N=",N,"y la normal"), color="Densidad",x="")
return(fig)
}
print(dist_asint(1000))
print(dist_asint(10000))
print(dist_asint(50000))
El estimador se ve asintóticamente normal y eficiente.